Method and apparatus for accessing an audio file from a collection of audio files using tonal matching

ABSTRACT

A method and apparatus for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with an electronic device. The method includes generating one index comprising information entries obtained from each of the more than one audio file in the collection, with each audio file in the collection being linked to at least one information entry; receiving a vocal input during a voice reception mode; converting the vocal input into a digital signal using a digital-analog converter; analysing the digital signal using frequency spectrum analysis into discrete portions; and comparing the discrete portions with the entries in the index. It is advantageous that the audio file is accessed when the discrete portions substantially match at least one of the information entries in the index. It is preferable that the discrete portions are either musical notes or waveforms.

FIELD OF INVENTION

This invention relates to a method and apparatus for accessing an audio file from a collection of audio files, and particularly relates to the accessing of files using tonal matching.

BACKGROUND

The advent of the age of affordable digital entertainment has given rise to a sharp increase in the adoption of personal digital entertainment devices by consumers. Such personal digital entertainment devices are usually equipped with storage capacities of a range of sizes. Given the falling prices of storage devices like hard drives and flash memory, an increasing number of personal digital entertainment devices come with storage capacities exceeding 1 GB. Storage capacities of such sizes in personal digital entertainment devices used for audio files enable the storage of hundreds and even thousands of files.

While the audio files may be stored and categorisable according to their song titles, artistes, genre or the like, there may be instances where a user may forget the title or artiste of a song, rendering a search for the pertinent audio file akin to searching for a needle in a haystack. In many instances, the user may only be able to remember a portion of the song or its tune. At the present moment, this does not aid in the search for the pertinent audio file in any way. This is a problem when attempting to access audio files in a large collection of audio files where certain information like title or artiste of a song is unknown. This problem also arises when the visually impaired attempt to access audio files in a collection of audio files where they are unable to select the audio files through the use of sight.

It is also rather difficult to improve one's vocal prowess without engaging expensive vocal coaches. It is currently difficult to improve one's vocal prowess independently besides using karaoke machines with “scoring” functionalities incorporated in them. There are currently few devices available which are able to determine the quality of one's vocal prowess easily and conveniently.

SUMMARY OF INVENTION

In a preferred aspect of the present invention, there is provided a method for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with an electronic device. The method includes generating one index comprising information entries obtained from each of the more than one audio file in the collection, with each audio file in the collection being linked to at least one information entry; receiving a vocal input during a voice reception mode; converting the vocal input into a digital signal using a digital-analog converter; analysing the digital signal using frequency spectrum analysis into discrete portions; and comparing the discrete portions with the entries in the index. It is advantageous that the audio file is accessed when the discrete portions substantially coincide with at least one of the information entries in the index. It is preferable that the discrete portions are either musical notes or waveforms. The at least one information entry may also be musical notes or waveforms.

The vocal input may preferably be speaker independent and may be in the form of singing, humming, or whistling. The form of vocal input may preferably be manually or automatically selectable.

It is preferable that the audio file is accessible from the electronic device itself, a device functionally connected to the electronic device or a connected computer network. The information entry may also preferably be received from the audio file, a pre-recorded vocal entry linked to the audio file, or a connected computer network. It is preferable that the electronic device is selected from the group comprising: vehicle audio system, desktop computer, notebook computer, PDA, portable media player and mobile phone.

It is also preferable that the method further includes selecting a facility to access the audio files by depressing a pre-determined button at least once, and filtering the vocal input.

There is also provided an apparatus for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with the apparatus. It is preferable that the apparatus includes an indexer for generating an index comprising information entries obtained from each of the more than one audio file in the collection, with each audio file in the collection being linked to at least one information entry; a vocal reception means for receiving a vocal input during a vocal reception mode; a digital-analog converter for converting the vocal input into a digital signal; and a processor to analyse the digital signal using frequency spectrum analysis into discrete portions, the processor also being able to compare the discrete portions with the entries in the index. Advantageously, the audio file is accessed when the discrete portions substantially coincide with at least one of the information entries in the index. The apparatus may include a display and the vocal input may be filtered. The vocal reception mode may be activated by depressing at least one button at least once. It is preferable that the discrete portions are musical notes or waveforms.

It is preferable that the apparatus is selected from the group comprising: vehicle audio system, desktop computer, notebook computer, PDA, portable media player and mobile phone.

It is preferable that the vocal input is either manually or automatically selected from the group comprising: singing, humming, and whistling. Advantageously, the vocal input is speaker independent. The at least one information entry may be selected from either musical notes or waveforms. Preferably, the at least one information entry is received from the audio file, a pre-recorded vocal entry linked to the audio file, or a connected computer network. The audio file may be accessible from the electronic device itself, any device functionally connected to the electronic device or a connected computer network.

There is also provided a method of determining a level of quality for vocal input using the aforementioned apparatus.

DESCRIPTION OF DRAWINGS

In order that the present invention may be fully understood and readily put into practical effect, there shall now be described by way of non-limitative example only preferred embodiments of the present invention, the description being with reference to the accompanying illustrative drawings.

FIG. 1 shows a flow chart of a method of a preferred embodiment of the present invention.

FIG. 2 shows a schematic diagram of an apparatus of a preferred embodiment of the present invention.

DESCRIPTION OF PREFERRED EMBODIMENTS

The following discussion is intended to provide a brief, general description of a suitable computing environment in which the present invention may be implemented. Although not required, the invention will be described in the general context of computer-executable instructions, such as program modules, being executed by a personal computer. Generally, program modules include routines, programs, characters, components and data structures that perform particular tasks or implement particular abstract data types. As those skilled in the art will appreciate, the invention may be practiced with other computer system configurations, including hand-held devices, multiprocessor systems, microprocessor-based or programmable consumer electronics, network PCs, minicomputers, mainframe computers, and the like. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote memory storage devices.

Referring to FIG. 1, there is provided a flow chart of a method for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with an electronic device. The electronic device may be, for example, a vehicle audio system, a desktop computer, a notebook computer, a PDA, a portable media player or a mobile phone and the like. The method may include an enablement of a vocal reception mode (20) in the electronic device in a manner like, for example, depressing a pre-determined button on the electronic device at least once. The vocal reception mode may be enabled or disabled so that a power source in the electronic device is not continually drained by continual enablement of the vocal reception mode. The vocal reception mode may be for vocal input such as, for example, singing, humming, or whistling.

The enablement of the vocal reception mode in the electronic device may initialise an indexing system (24). Once the indexing system is initiated, the system then determines whether the composition of audio files in the collection has changed (26). The composition of audio files may include the number of audio files and the audio filenames. The index may comprise information entries obtained from each of the more than one audio file in the collection of audio files stored in the electronic device, any device functionally connected to the electronic device or a connected computer network. Connection to the computer network may be via wired or wireless means. Each audio file in the collection may be linked to at least one information entry in the index. The at least one information entry may be musical notes or waveforms determined using semantic segmentation corresponding to a portion or the whole content stored in the audio files. The information entry may also be a MIDI component that is linked/attached to an audio file like file metadata. The information entry may also be obtainable from a pre-recorded vocal entry linked/attached to the audio file, or a connected computer network. There may be an online database on the connected computer network where information entries of musical notes or waveforms are downloadable for each audio file.
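
By way of non-limitative illustration only, the determination of whether the composition of audio files has changed (26) might be sketched as follows. The names `composition_of` and `composition_changed` are hypothetical and do not appear in the embodiments; the sketch assumes that the composition consists of the number of audio files and the audio filenames, as described above.

```python
from typing import Iterable, Tuple

# Hypothetical representation of a "composition": the file count plus the
# sorted filenames, matching the two properties named in the description.
Composition = Tuple[int, Tuple[str, ...]]


def composition_of(filenames: Iterable[str]) -> Composition:
    """Summarise a collection by its number of files and their filenames."""
    names = tuple(sorted(filenames))
    return (len(names), names)


def composition_changed(previous: Composition, filenames: Iterable[str]) -> bool:
    """Return True when files were added, removed or renamed since `previous`."""
    return composition_of(filenames) != previous
```

In such a sketch, the composition recorded when the index was last generated would be kept alongside the index and compared against the current collection each time the indexing system is initialised.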

If the composition of audio files is found to be different, a search is conducted on the collection of audio files stored in the electronic device, any device functionally connected to the electronic device or a connected computer network (28). This step is to determine whether audio files have been added to or removed from the collection. Subsequent to the search, information entries obtained from each audio file directly (25), information entries downloaded from the connected computer network for each audio file (29), or pre-recorded vocal entries linked to each audio file (23) may be combined into an index (30). The index is then loaded for use (32) in the electronic device.
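
One possible way of combining the three sources of information entries into a single index (30) is sketched below, purely as an illustration. The function name `build_index`, the representation of an information entry as a list of MIDI note numbers, and the use of a mapping keyed by filename are all assumptions made for the sketch and are not prescribed by the embodiments.

```python
from typing import Dict, List, Mapping, Sequence

# Assumption: an information entry is a sequence of MIDI note numbers.
Entry = Sequence[int]


def build_index(
    from_files: Mapping[str, Entry],
    downloaded: Mapping[str, Entry],
    prerecorded: Mapping[str, Entry],
) -> Dict[str, List[Entry]]:
    """Combine entries from every source so that each audio file is linked
    to at least one information entry, and possibly to several."""
    index: Dict[str, List[Entry]] = {}
    for source in (from_files, downloaded, prerecorded):
        for filename, entry in source.items():
            index.setdefault(filename, []).append(entry)
    return index
```

Keeping a list of entries per file allows, for instance, a pre-recorded vocal entry and an entry derived from the audio file itself to coexist for the same file.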

If the composition of audio files is found to be unchanged, the last used index is then loaded for use (32) in the electronic device. With the enablement of the vocal reception mode, there may be vocal input into the device (34). The vocal input may be singing, humming, or whistling. In a particular instance, the vocal input need not be a song in its entirety. A portion of a song may be sufficient as a viable form of the vocal input. The vocal input may be filtered. A user may be able to manually select a specific vocal input (22) for the vocal reception mode. There may also be automatic detection of vocal input (22). Vocal reception by the electronic device may be speaker independent. The vocal reception mode may have automatic volume correction for the vocal input if the vocal input is either too loud (such that distortion of input occurs) or too soft (such that input is inaudible). The electronic device may also be able to overcome the problem of an off tune vocal input during the vocal input mode by providing a selection of audio files that most closely approximate the off tune vocal input based on the entries of the audio files in the index. The user may set the device to show the closest approximations up to a pre-determined number, such as, for example, the ten closest approximations.
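
The automatic volume correction mentioned above is not detailed further in the embodiments. As a hedged illustration only, one simple approach is peak normalisation of the sampled input, sketched below. The function name `correct_volume` and the parameters `target_peak` and `min_peak` are hypothetical, and samples are assumed to be floating point values nominally in the range -1.0 to 1.0.

```python
from typing import List, Sequence


def correct_volume(
    samples: Sequence[float],
    target_peak: float = 0.8,
    min_peak: float = 1e-4,
) -> List[float]:
    """Scale the input so that its loudest sample reaches `target_peak`.

    Input that is too loud is attenuated and input that is too soft is
    amplified. Input below `min_peak` is treated as silence and returned
    unchanged, so that background noise is not amplified.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak < min_peak:
        return list(samples)
    gain = target_peak / peak
    return [s * gain for s in samples]
```

Scaling after capture cannot undo distortion that has already occurred in the input stage, so a practical device might instead adjust the input gain before conversion; the sketch only shows the scaling arithmetic.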

Subsequently, the vocal input in analog form is converted into digital signals by a digital-analog converter (36). The converter may be an analog-MIDI converter. Thereafter, a processor in the electronic device may analyse the digital signals into discrete portions, where the discrete portions may be either musical notes or waveforms. Processing of the digital signals may be done using frequency spectrum analysis. The processor may then compare the discrete portions with entries in the index (40). Exact or substantial similarity between the discrete portions and entries in the index enables the generation of a listing of audio files in order of extent of similarity (42). The listing may show a number of audio files, a number that may be pre-determined by the user and may be shown on a display on the electronic device. The extent of similarity may be based on relative closeness in terms of either musical notes or waveforms.
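
As a minimal sketch only, the analysis of a digital signal into musical notes using frequency spectrum analysis might proceed frame by frame as shown below. The sketch assumes a monophonic input, a fixed frame size, and a plain discrete Fourier transform evaluated directly; the function names and the mapping to MIDI note numbers (with A4 = 440 Hz as note 69) are illustrative choices, not features taken from the embodiments.

```python
import cmath
import math
from typing import List, Sequence


def dominant_frequency(frame: Sequence[float], sample_rate: int) -> float:
    """Return the frequency (Hz) of the strongest non-DC spectral bin,
    or 0.0 when the frame carries no energy."""
    n = len(frame)
    best_bin, best_mag = 0, 0.0
    for k in range(1, n // 2):
        # Direct evaluation of the k-th DFT coefficient of the frame.
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        if abs(coeff) > best_mag:
            best_bin, best_mag = k, abs(coeff)
    return best_bin * sample_rate / n


def frequency_to_midi(freq_hz: float) -> int:
    """Map a frequency to the nearest MIDI note number (A4 = 440 Hz = 69)."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))


def notes_from_signal(
    samples: Sequence[float], sample_rate: int, frame_size: int
) -> List[int]:
    """Split the signal into frames and emit one note per non-silent frame."""
    notes = []
    for start in range(0, len(samples) - frame_size + 1, frame_size):
        freq = dominant_frequency(samples[start:start + frame_size], sample_rate)
        if freq > 0:
            notes.append(frequency_to_midi(freq))
    return notes
```

The direct transform is chosen here for brevity and costs time proportional to the square of the frame size; a fast Fourier transform would ordinarily be preferred on a real device. Picking the strongest bin is also a simplification, since sung or hummed input may carry strong overtones.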

Referring to FIG. 2, there is provided an apparatus 50 for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with the apparatus 50. The apparatus 50 may be, for example, a vehicle audio system, desktop computer, notebook computer, PDA, portable media player or mobile phone. The components described in the following sections may be incorporated in the aforementioned different forms of the apparatus 50 in addition to components used for their primary functionalities.

The apparatus 50 may include a digital storage device 58 for the storage of the audio files that make up the collection of files. The digital storage device 58 may be non-volatile memory in the form of a hard disk drive or flash memory. The digital storage device 58 may have capacities of at least a few megabytes.

In addition, the apparatus 50 may also include an indexer 56 for generating an index comprising information entries obtained from each of the more than one audio file in the collection. The index may comprise information entries obtained from each of the more than one audio file in a collection of audio files stored in the digital storage device 58 of the apparatus 50, any device functionally connected to the apparatus 50 or a connected computer network. Each audio file in the collection may be linked to at least one information entry in the index. The at least one information entry may be musical notes or waveforms determined using semantic segmentation corresponding to a portion or the whole content stored in the audio files. The information entry may also be a MIDI component that is linked/attached to an audio file like file metadata. The information entry may also be obtainable from a pre-recorded vocal entry linked/attached to the audio file, or a connected computer network. There may be an online database on the connected computer network where information entries of musical notes or waveforms are downloadable for each audio file.

A vocal reception means 64 for receiving a vocal input during a vocal reception mode may also be included in the apparatus 50. The vocal reception means 64 may be a microphone. The vocal input may be singing, humming, or whistling. In a particular instance, the vocal input need not be a song in its entirety. A portion of a song may be sufficient as a viable form of the vocal input. The vocal input may also be filtered. There may be a selector to choose the type of vocal input, or detection of vocal input may be automatic. The vocal reception mode may be activated by pressing an activating button 63 incorporated with the apparatus 50 at least once. Vocal input into the vocal reception means 64 may be speaker independent. The vocal reception mode may have automatic volume correction for the vocal input if the vocal input is either too loud (such that distortion of input occurs) or too soft (such that input is inaudible). The electronic device may also be able to overcome the problem of an off tune vocal input during the vocal input mode by providing a selection of audio files that most closely approximate the off tune vocal input based on the entries of the audio files in the index. The user may set the device to show the closest approximations up to a pre-determined number, such as, for example, the ten closest approximations.

The vocal reception means 64 may be coupled to a digital-analog converter 62 which converts the vocal input through the vocal reception means 64 into digital signals. The converter 62 may be an analog-MIDI converter. The converted digital signals are then passed into a processor 60 for analysis of the digital signals into discrete portions, where the discrete portions may be either musical notes or waveforms. Processing of the digital signals by the processor 60 may be done using frequency spectrum analysis. The processor 60 may then be able to compare the discrete portions of the signals with the entries in the index generated by the indexer 56. Audio files may thereby be accessible when the discrete portions substantially coincide with at least one of the information entries in the index. Exact or substantial similarity between the discrete portions and entries in the index enables the generation of a listing of audio files in order of extent of similarity. The listing may show a number of audio files, a number that may be pre-determined by the user. A display 54 in the apparatus 50 allows for the listing of files to be shown clearly for selection by the user. The extent of similarity may be based on relative closeness in terms of either musical notes or waveforms.
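
The comparison performed by the processor 60 and the resulting listing in order of extent of similarity could, as one hedged example, be sketched as below. The embodiments do not specify a similarity measure. This sketch assumes that both the vocal input and the information entries are sequences of MIDI note numbers, and it compares the intervals between successive notes so that an input sung in a different key can still match, which is one possible reading of speaker independence. The names `intervals`, `distance` and `rank_files` are hypothetical.

```python
from typing import Dict, List, Sequence


def intervals(notes: Sequence[int]) -> List[int]:
    """Semitone steps between successive notes (transposition invariant)."""
    return [b - a for a, b in zip(notes, notes[1:])]


def distance(query: Sequence[int], entry: Sequence[int]) -> float:
    """Smallest total interval mismatch of the query against any
    equally long window of the entry; lower means more similar."""
    q, e = intervals(query), intervals(entry)
    if not q or len(q) > len(e):
        return float("inf")
    return min(
        sum(abs(a - b) for a, b in zip(q, e[i:i + len(q)]))
        for i in range(len(e) - len(q) + 1)
    )


def rank_files(
    query: Sequence[int],
    index: Dict[str, List[Sequence[int]]],
    limit: int = 10,
) -> List[str]:
    """List filenames from most to least similar, up to `limit` entries."""
    scored = []
    for filename, entries in index.items():
        best = min((distance(query, entry) for entry in entries),
                   default=float("inf"))
        if best != float("inf"):
            scored.append((best, filename))
    scored.sort()
    return [filename for _, filename in scored[:limit]]
```

A distance of zero corresponds to an exact match, while small non-zero distances correspond to substantial similarity, so a slightly off tune input still yields a listing of the closest approximations. The default `limit` of ten mirrors the example of the ten closest approximations given above.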

The visually impaired may be able to use apparatus 50 to access files stored within or accessible with the apparatus 50 using tonal matching. While they are unable to select the files shown on the display 54, they may access the audio file which has been extracted from the collection at their convenience just from using vocal input.

An alternative application of the present invention makes use of the vocal reception mode of the electronic device to ascertain and improve vocal abilities of users. For example, if a user repeatedly fails to find a desired audio file through the use of vocal input into the electronic device, it is highly probable that the user's vocal input (prowess) is flawed. Thus the user is then inclined to continually practice vocal input into the electronic device until improvement is attained in terms of a higher incidence of finding a desired audio file. Thus, a device to conveniently ascertain a level of quality for vocal input is also disclosed.

Whilst there has been described in the foregoing description preferred embodiments of the present invention, it will be understood by those skilled in the technology concerned that many variations or modifications in details of design or construction may be made without departing from the present invention.

1. A method for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with an electronic device, including: generating one index comprising information entries obtained from each of the more than one audio file in the collection, with each audio file in the collection being linked to at least one information entry; receiving a vocal input during a voice reception mode; converting the vocal input into a digital signal using a digital-analog converter; analysing the digital signal using frequency spectrum analysis into discrete portions; and comparing the discrete portions with the information entries in the index, wherein the at least one audio file is accessed when the discrete portions substantially match at least one information entry in the index.
2. The method of claim 1, wherein the discrete portions are selected from the group consisting of: musical notes and waveforms.
3. The method of claim 1, wherein the vocal input is selected from the group consisting of: singing, humming, and whistling.
4. The method of claim 1, wherein the at least one information entry is selected from the group consisting of: musical notes and waveforms.
5. The method of claim 1, wherein the audio file is accessible from a source selected from the group consisting of: the electronic device, any device functionally connected to the electronic device and a connected computer network.
6. The method of claim 3, wherein the vocal input is set by means selected from the group consisting of: manual selection and automatic selection.
7. The method of claim 1, wherein the vocal input is speaker independent.
8. The method of claim 1, wherein the at least one information entry is received from a source selected from the group consisting of: the audio file, a pre-recorded vocal entry linked to the audio file, and a connected computer network.
9. The method of claim 1, wherein the electronic device is selected from the group consisting of: vehicle audio system, desktop computer, notebook computer, PDA, portable media player and mobile phone.
10. The method of claim 1, further including selecting a facility to access the audio files by depressing a pre-determined button at least once.
11. The method of claim 1, further including filtering the vocal input.
12. An apparatus for accessing at least one audio file from a collection comprising more than one audio file stored within or accessible with the apparatus, including: an indexer configured to generate an index comprising information entries obtained from each of the more than one audio file in the collection, with each audio file in the collection being linked to at least one information entry; a vocal receiver configured to receive a vocal input during a vocal reception mode; a digital-analog converter configured to convert the vocal input into a digital signal; and a processor configured to analyse the digital signal using frequency spectrum analysis into discrete portions and to compare the discrete portions with the information entries in the index, wherein the at least one audio file is accessed when the discrete portions substantially match at least one information entry in the index.
13. The apparatus of claim 12, wherein the apparatus is selected from the group consisting of: vehicle audio system, desktop computer, notebook computer, PDA, portable media player and mobile phone.
14. The apparatus of claim 12, wherein the vocal input is selected from the group consisting of: singing, humming, and whistling.
15. The apparatus of claim 14, wherein the vocal input is set by means selected from the group consisting of: manual selection and automatic selection.
16. The apparatus of claim 12, wherein the at least one information entry is selected from the group consisting of: musical notes and waveforms.
17. The apparatus of claim 12, wherein the vocal input is speaker independent.
18. The apparatus of claim 12, wherein the at least one information entry is received from a source selected from the group consisting of: the audio file, a pre-recorded vocal entry linked to the audio file, and a connected computer network.
19. The apparatus of claim 12, wherein the vocal reception mode is activated by depressing at least one button at least once.
20. The apparatus of claim 12, further including a display.
21. The apparatus of claim 12, wherein the vocal input is filtered.
22. The apparatus of claim 12, wherein the discrete portions are selected from the group consisting of: musical notes and waveforms.
23. The apparatus of claim 12, wherein the audio file is accessible from a source selected from the group consisting of: the electronic device, any device functionally connected to the electronic device and a connected computer network.
24. A method of determining a level of quality for vocal input using the apparatus of claim 12.
25. A method for accessing at least one audio file from a collection of audio files stored within or accessible with an electronic device, the method comprising: generating an index comprising information entries obtained from audio files in the collection, each audio file in the collection having at least one corresponding information entry in the index; analysing a digital signal into discrete portions, the digital signal being obtained from a converted vocal input received during a voice reception mode; and comparing the discrete portions with the information entries in the index, wherein the at least one audio file is accessed when the discrete portions substantially match at least one information entry in the index.
26. The method according to claim 25, wherein the digital signal is analysed into discrete portions using frequency spectrum analysis.
27. The method according to claim 25, wherein the vocal input is converted into the digital signal using a digital analog converter.